Standard imitation learning can fail when the expert demonstrators have different sensory inputs than the imitating agent. This is because partial observability gives rise to hidden confounders in the causal graph. We break down the space of confounded imitation learning problems and identify three settings with different data requirements in which the correct imitation policy can be identified. We then introduce an algorithm for deconfounded imitation learning, which trains an inference model jointly with a latent-conditional policy. At test time, the agent alternates between updating its belief over the latent and acting under the belief. We show in theory and practice that this algorithm converges to the correct interventional policy, solves the confounding issue, and can under certain assumptions achieve an asymptotically optimal imitation performance.
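The test-time loop described above (update the belief over the latent, then act under that belief) can be sketched with a toy discrete latent. This is only an illustration under stated assumptions: the abstract describes a learned inference model, whereas here the observation likelihoods and the two latent-conditional policies are hand-written, hypothetical stand-ins.

```python
def update_belief(belief, likelihoods):
    """Bayes' rule over a discrete latent: posterior is proportional to prior * likelihood."""
    unnormalized = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def mixture_policy(belief, latent_policies, observation):
    """Act under the belief: average the latent-conditional action probabilities."""
    per_latent = [policy(observation) for policy in latent_policies]
    n_actions = len(per_latent[0])
    return [sum(b * p[a] for b, p in zip(belief, per_latent)) for a in range(n_actions)]


# Hypothetical latent-conditional policies (action probabilities per latent value).
latent_policies = [lambda obs: [0.9, 0.1], lambda obs: [0.2, 0.8]]
# Hypothetical observation model: P(obs == 1 | latent).
obs_likelihood = [0.3, 0.7]

belief = [0.5, 0.5]
first_action_probs = mixture_policy(belief, latent_policies, 1)
for obs in [1, 1, 1]:
    action_probs = mixture_policy(belief, latent_policies, obs)  # act under the current belief
    likelihoods = [p if obs == 1 else 1.0 - p for p in obs_likelihood]
    belief = update_belief(belief, likelihoods)                  # then update the belief
```

After three observations that are more likely under the second latent value, the belief concentrates on it, and the mixture policy shifts toward that latent's preferred action.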
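Meta-gradients provide a general approach for optimizing the meta-parameters of reinforcement learning (RL) algorithms. The estimation of meta-gradients is central to the performance of these meta-algorithms, and has been studied in the setting of MAML-style short-horizon meta-RL problems. In this setting, prior work has investigated the estimation of the Hessian of the RL objective, and has tackled the problem of credit assignment to pre-adaptation behavior by applying a sampling correction. However, we show that Hessian estimation, as implemented for example by DiCE and its variants, always adds bias and can also add variance to meta-gradient estimation. Meanwhile, meta-gradient estimation has been studied less in the important long-horizon setting, where backpropagation through the full inner optimization trajectory is not feasible. We study the bias-variance tradeoff arising from truncated backpropagation and sampling correction, and additionally compare to evolution strategies, a recently popular alternative for long-horizon meta-learning. While prior work implicitly chooses points in this bias-variance space, we disentangle the sources of bias and variance and present an empirical study that relates existing estimators to each other.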
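As a rough illustration of the evolution-strategies alternative mentioned above, the sketch below estimates a meta-gradient without backpropagating through the inner loop. Everything here is an assumption made for the example: the inner problem is gradient descent on a one-dimensional quadratic, the meta-parameter is the learning rate, and the function names are hypothetical. It is not one of the estimators studied in the paper.

```python
import random


def final_loss(learning_rate, steps=10, w0=1.0):
    """Inner loop: gradient descent on 0.5 * w**2, returning the loss after `steps` updates."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * w  # the gradient of 0.5 * w**2 is w
    return 0.5 * w * w


def es_meta_gradient(objective, meta_param, sigma=0.01, pairs=1000, seed=0):
    """Antithetic evolution-strategies estimate of d objective / d meta_param."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(pairs):
        eps = rng.gauss(0.0, 1.0)
        diff = objective(meta_param + sigma * eps) - objective(meta_param - sigma * eps)
        total += diff / (2.0 * sigma) * eps
    return total / pairs


estimate = es_meta_gradient(final_loss, 0.1)
# Closed form for this toy problem: the final loss is 0.5 * (1 - lr)**20.
exact = -10 * 0.9 ** 19
```

The estimator only needs evaluations of the outer objective, which is why it remains usable when the inner trajectory is too long to differentiate through; the price is sampling variance that shrinks as `pairs` grows.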
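Consistency is the theoretical property of a meta-learning algorithm that guarantees that, under certain conditions, it can adapt to any task at test time. An open question is whether and how theoretical consistency translates into practice, in comparison to inconsistent algorithms. In this paper, we empirically investigate this question on a set of representative meta-RL algorithms. We find that theoretically consistent algorithms can indeed usually adapt to out-of-distribution (OOD) tasks, while inconsistent ones cannot, although consistent algorithms can still fail in practice for reasons such as poor exploration. We further find that theoretically inconsistent algorithms can be made consistent by continuing to update all agent components on the OOD tasks, and that they then adapt as well as or better than originally consistent ones. We conclude that theoretical consistency is indeed a desirable property, and that inconsistent meta-RL algorithms can easily be made consistent to enjoy the same benefits.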
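Our main contribution in this work is an empirical finding that random General Value Functions (GVFs), i.e., deep action-conditional predictions that are random both in which features of the observations they predict and in the sequence of actions the predictions are conditioned on, form good auxiliary tasks for reinforcement learning (RL) problems. In particular, we show that random deep action-conditional predictions, when used as auxiliary tasks, yield state representations that produce control performance competitive with state-of-the-art hand-crafted auxiliary tasks such as value prediction, pixel control, and CURL on both Atari and DeepMind Lab tasks. In another set of experiments, we stop the gradients from the RL part of the network to the state-representation-learning part of the network and show, perhaps surprisingly, that the auxiliary tasks alone are sufficient to learn state representations good enough to outperform an end-to-end trained actor-critic baseline. We open-sourced our code at https://github.com/hwhitetooth/random_gvs.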
Artificial Intelligence (AI) has become commonplace to solve routine everyday tasks. Because of the exponential growth in medical imaging data volume and complexity, the workload on radiologists is steadily increasing. We project that the gap between the number of imaging exams and the number of expert radiologist readers required to cover this increase will continue to expand, consequently introducing a demand for AI-based tools that improve the efficiency with which radiologists can comfortably interpret these exams. AI has been shown to improve efficiency in medical-image generation, processing, and interpretation, and a variety of such AI models have been developed across research labs worldwide. However, very few of these, if any, find their way into routine clinical use, a discrepancy that reflects the divide between AI research and successful AI translation. To address the barrier to clinical deployment, we have formed MONAI Consortium, an open-source community which is building standards for AI deployment in healthcare institutions, and developing tools and infrastructure to facilitate their implementation. This report represents several years of weekly discussions and hands-on problem solving experience by groups of industry experts and clinicians in the MONAI Consortium. We identify barriers between AI-model development in research labs and subsequent clinical deployment and propose solutions. Our report provides guidance on processes which take an imaging AI model from development to clinical implementation in a healthcare institution. We discuss various AI integration points in a clinical Radiology workflow. We also present a taxonomy of Radiology AI use-cases. Through this report, we intend to educate the stakeholders in healthcare and AI (AI researchers, radiologists, imaging informaticists, and regulators) about cross-disciplinary challenges and possible solutions.
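Reliable point cloud data is essential for perception tasks, e.g., in robotics and autonomous driving applications. Adverse weather causes a specific type of noise in light detection and ranging (LiDAR) sensor data, which significantly degrades the quality of the point clouds. To address this issue, this letter presents a novel deep learning algorithm for adverse-weather point cloud denoising (4DenoiseNet). Unlike the deep learning adverse-weather denoising methods in the literature, our algorithm takes advantage of the time dimension. Compared to previous work, it performs about 10% better in terms of the intersection-over-union metric and is more computationally efficient. These results are achieved on our novel SnowyKITTI dataset, which contains over 40000 adverse-weather annotated point clouds. Moreover, strong qualitative results on the Canadian Adverse Driving Conditions dataset indicate good generalizability to domain shifts and to different sensor intrinsics.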
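For readers unfamiliar with the intersection-over-union metric reported above, here is a minimal sketch for point-wise binary labels. The label convention (1 for a noise point, 0 for a valid point) and the handling of the empty case are assumptions of this example, not details taken from the paper.

```python
def intersection_over_union(predicted, target, positive=1):
    """IoU for one class over point-wise labels: TP / (TP + FP + FN)."""
    pairs = list(zip(predicted, target))
    true_pos = sum(1 for p, t in pairs if p == positive and t == positive)
    false_pos = sum(1 for p, t in pairs if p == positive and t != positive)
    false_neg = sum(1 for p, t in pairs if p != positive and t == positive)
    denominator = true_pos + false_pos + false_neg
    # Convention chosen here: if the class is absent from both, count it as a perfect match.
    return true_pos / denominator if denominator else 1.0


# Five points; 1 marks a point labelled as weather noise (hypothetical convention).
predicted = [1, 1, 0, 0, 1]
target = [1, 0, 0, 1, 1]
iou = intersection_over_union(predicted, target)  # 2 hits, 1 false alarm, 1 miss -> 0.5
```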
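Algorithm selection wizards are effective and versatile tools that automatically select an optimization algorithm given high-level information about the problem and the available computational resources, such as the number and type of decision variables, the maximal number of evaluations, and the possibility of parallelizing evaluations. State-of-the-art algorithm selection wizards are complex and difficult to improve. In this work, we propose using automated configuration methods to improve their performance by finding better configurations of the algorithms that compose them. In particular, we use elitist iterated racing (irace) to find CMA configurations for specific artificial benchmarks; these replace the hand-crafted CMA configurations currently used in the NGOpt wizard provided by the Nevergrad platform. We discuss in detail the setup of irace for the purpose of generating configurations that work well across the diverse set of problem instances within each benchmark. Our approach also improves the performance of the NGOpt wizard on benchmark suites that were not part of the tuning by irace.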
Most AI systems are black boxes generating reasonable outputs for given inputs. Some domains, however, have explainability and trustworthiness requirements that cannot be directly met by these approaches. Various methods have therefore been developed to interpret black-box models after training. This paper advocates an alternative approach where the models are transparent and explainable to begin with. This approach, EVOTER, evolves rule-sets based on simple logical expressions. The approach is evaluated in several prediction/classification and prescription/policy search domains with and without a surrogate. It is shown to discover meaningful rule sets that perform similarly to black-box models. The rules can provide insight into the domain, and make biases hidden in the data explicit. It may also be possible to edit them directly to remove biases and add constraints. EVOTER thus forms a promising foundation for building trustworthy AI systems for real-world applications in the future.
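To make the idea of a transparent rule set built from simple logical expressions concrete, the following sketch evaluates a hand-written rule list on a sample. The representation (first matching rule wins, with a default outcome), the feature names, and the thresholds are all hypothetical; the abstract does not specify EVOTER's actual rule grammar, and no evolution is performed here.

```python
import operator

# Comparison operators allowed in a condition.
OPS = {"<": operator.lt, ">=": operator.ge}


def evaluate_rule_set(rules, default, sample):
    """Return the outcome of the first rule whose conditions all hold, else the default."""
    for conditions, outcome in rules:
        if all(OPS[op](sample[feature], threshold) for feature, op, threshold in conditions):
            return outcome
    return default


# Each rule is (list of (feature, operator, threshold) conditions, outcome).
rules = [
    ([("age", ">=", 65), ("blood_pressure", ">=", 140)], "high_risk"),
    ([("age", "<", 30)], "low_risk"),
]

prediction = evaluate_rule_set(rules, "medium_risk", {"age": 70, "blood_pressure": 150})
```

Because every rule is a readable conjunction of threshold tests, a domain expert could inspect or edit the list directly, which is the kind of transparency the abstract argues for.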
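This paper characterizes the inherent power of evolutionary algorithms. This power depends on the computational properties of the genetic encoding. With some encodings, two parents recombined with a simple crossover operator can sample from an arbitrary distribution of child phenotypes. Such encodings are termed expressive encodings in this paper. Universal function approximators, including the popular evolutionary substrates of genetic programming and neural networks, can be used to construct expressive encodings. Remarkably, this approach need not be applied only to domains where the phenotype is a function: expressivity can be achieved even when optimizing static structures, such as binary vectors. Such simpler settings make it possible to characterize expressive encodings theoretically: across a variety of test problems, expressive encodings are shown to achieve up to super-exponential convergence speed-ups over the standard direct encoding. The conclusion is that, across evolutionary computation areas as diverse as genetic programming, neuroevolution, genetic algorithms, and theory, expressive encodings can be a key to understanding and realizing the full power of evolution.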
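For contrast with the expressive encodings described above, the sketch below shows the standard direct encoding of binary vectors under a simple one-point crossover. It illustrates the baseline only, not the paper's construction: with a direct encoding, every child must copy any bit on which the two parents agree, so crossover alone can reach only a restricted set of children. The function name and the example parents are hypothetical.

```python
def one_point_children(parent_a, parent_b):
    """All children reachable by one-point crossover under a direct encoding."""
    children = set()
    for cut in range(len(parent_a) + 1):
        children.add(tuple(parent_a[:cut] + parent_b[cut:]))
        children.add(tuple(parent_b[:cut] + parent_a[cut:]))
    return children


parent_a = [1, 0, 1, 1, 0]
parent_b = [1, 1, 0, 1, 0]
children = one_point_children(parent_a, parent_b)

# Positions where both parents carry the same bit are fixed in every child.
agreeing = [i for i, (a, b) in enumerate(zip(parent_a, parent_b)) if a == b]
```

Here the parents agree at positions 0, 3, and 4, so no crossover child can differ from them there; an encoding that removes this restriction is what the abstract calls expressive.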
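Propositional satisfiability (SAT) is an NP-complete problem that impacts many research fields, such as planning, verification, and security. Mainstream modern SAT solvers are based on the Conflict-Driven Clause Learning (CDCL) algorithm. Recent work has aimed to enhance CDCL SAT solvers by improving their variable branching heuristics through predictions generated by Graph Neural Networks (GNNs). So far, however, this approach either has not made solving more effective or has required online access to substantial GPU resources. To make GNN improvements practical, this paper proposes an approach called NeuroComb, which builds on two insights: (1) predictions of important variables and clauses can be combined with dynamic branching into a more effective hybrid branching strategy, and (2) it is sufficient to query the neural model only once, before SAT solving starts. NeuroComb is implemented as an enhancement to a classic CDCL solver called MiniSat and to a more recent CDCL solver called Glucose. As a result, it allowed MiniSat to solve 11% more problems, and Glucose also to solve more problems, on the recent SATCOMP-2021 competition problem set, with a computational resource requirement of only one GPU. NeuroComb is therefore an effective and practical approach to improving SAT solving through machine learning.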
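The two insights above can be illustrated with a toy branching rule that mixes a dynamic activity score with importance predictions obtained once before solving. This is a hedged sketch only: the additive combination, the `weight` parameter, and all names are assumptions made for this example, and the abstract does not describe NeuroComb's actual hybrid rule at this level of detail.

```python
def pick_branch_variable(activity, predicted_importance, assigned, weight=1.0):
    """Choose the unassigned variable with the highest combined score.

    activity: dynamic per-variable scores maintained during search.
    predicted_importance: static scores queried once before solving (hypothetical).
    """
    candidates = [v for v in activity if v not in assigned]
    if not candidates:
        return None

    def combined_score(variable):
        static = predicted_importance.get(variable, 0.0)
        # Ties are broken in favour of the lower variable index.
        return (activity[variable] + weight * static, -variable)

    return max(candidates, key=combined_score)


activity = {1: 0.2, 2: 0.5, 3: 0.4}              # updated by the solver during search
predicted_importance = {1: 0.9, 2: 0.1, 3: 0.3}  # computed once, before search

purely_dynamic = pick_branch_variable(activity, predicted_importance, set(), weight=0.0)
hybrid = pick_branch_variable(activity, predicted_importance, set(), weight=1.0)
```

With `weight=0.0` the rule reduces to purely dynamic branching; with a positive weight the one-off predictions can change which variable is branched on first, without any further calls to the neural model during search.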